Pull it all #156

Closed
wants to merge 155 commits into from

@xMooz xMooz commented Jan 17, 2015

racist

Noltari pushed a commit to Noltari/linux that referenced this pull request Sep 6, 2019
commit a349b95 upstream.

This patch fixes an issue where the following error can occur when
the OHCI hardware raises an interrupt while the system is shutting
down.

[   34.851754] usb 2-1: USB disconnect, device number 2
[   35.166658] irq 156: nobody cared (try booting with the "irqpoll" option)
[   35.173445] CPU: 0 PID: 22 Comm: kworker/0:1 Not tainted 5.3.0-rc5 #85
[   35.179964] Hardware name: Renesas Salvator-X 2nd version board based on r8a77965 (DT)
[   35.187886] Workqueue: usb_hub_wq hub_event
[   35.192063] Call trace:
[   35.194509]  dump_backtrace+0x0/0x150
[   35.198165]  show_stack+0x14/0x20
[   35.201475]  dump_stack+0xa0/0xc4
[   35.204785]  __report_bad_irq+0x34/0xe8
[   35.208614]  note_interrupt+0x2cc/0x318
[   35.212446]  handle_irq_event_percpu+0x5c/0x88
[   35.216883]  handle_irq_event+0x48/0x78
[   35.220712]  handle_fasteoi_irq+0xb4/0x188
[   35.224802]  generic_handle_irq+0x24/0x38
[   35.228804]  __handle_domain_irq+0x5c/0xb0
[   35.232893]  gic_handle_irq+0x58/0xa8
[   35.236548]  el1_irq+0xb8/0x180
[   35.239681]  __do_softirq+0x94/0x23c
[   35.243253]  irq_exit+0xd0/0xd8
[   35.246387]  __handle_domain_irq+0x60/0xb0
[   35.250475]  gic_handle_irq+0x58/0xa8
[   35.254130]  el1_irq+0xb8/0x180
[   35.257268]  kernfs_find_ns+0x5c/0x120
[   35.261010]  kernfs_find_and_get_ns+0x3c/0x60
[   35.265361]  sysfs_unmerge_group+0x20/0x68
[   35.269454]  dpm_sysfs_remove+0x2c/0x68
[   35.273284]  device_del+0x80/0x370
[   35.276683]  hid_destroy_device+0x28/0x60
[   35.280686]  usbhid_disconnect+0x4c/0x80
[   35.284602]  usb_unbind_interface+0x6c/0x268
[   35.288867]  device_release_driver_internal+0xe4/0x1b0
[   35.293998]  device_release_driver+0x14/0x20
[   35.298261]  bus_remove_device+0x110/0x128
[   35.302350]  device_del+0x148/0x370
[   35.305832]  usb_disable_device+0x8c/0x1d0
[   35.309921]  usb_disconnect+0xc8/0x2d0
[   35.313663]  hub_event+0x6e0/0x1128
[   35.317146]  process_one_work+0x1e0/0x320
[   35.321148]  worker_thread+0x40/0x450
[   35.324805]  kthread+0x124/0x128
[   35.328027]  ret_from_fork+0x10/0x18
[   35.331594] handlers:
[   35.333862] [<0000000079300c1d>] usb_hcd_irq
[   35.338126] [<0000000079300c1d>] usb_hcd_irq
[   35.342389] Disabling IRQ #156

ohci_shutdown() disables all interrupts and sets rh_state to
OHCI_RH_HALTED. On the other hand, ohci_irq() may enable
OHCI_INTR_SF and OHCI_INTR_MIE. Note that OHCI_INTR_SF may be
set by start_ed_unlink(), which is called via:
 ohci_irq()
  -> process_done_list()
   -> takeback_td()
    -> start_ed_unlink()

Because ohci_irq() has the following condition, the issue happens when
&ohci->regs->intrenable = OHCI_INTR_MIE | OHCI_INTR_SF and
ohci->rh_state = OHCI_RH_HALTED:

	/* interrupt for some other device? */
	if (ints == 0 || unlikely(ohci->rh_state == OHCI_RH_HALTED))
		return IRQ_NOTMINE;
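
The state combination described above can be illustrated with a small userspace model. This is only a sketch, not the driver code: every `model_` and `MODEL_` name is hypothetical, the bit positions are assumed to follow the OHCI interrupt registers, and only the early-exit check quoted above is mirrored. It shows that a handler which returns "not mine" while the controller is halted leaves the interrupt line asserted, which is the situation the kernel reports as "nobody cared".

```c
#include <stdbool.h>
#include <stdint.h>

/* Model constants (assumed bit positions, names are hypothetical). */
#define MODEL_INTR_SF  (UINT32_C(1) << 2)   /* start of frame */
#define MODEL_INTR_MIE (UINT32_C(1) << 31)  /* master interrupt enable */

enum model_rh_state { MODEL_RH_HALTED, MODEL_RH_RUNNING };
enum model_irqreturn { MODEL_IRQ_NOTMINE, MODEL_IRQ_HANDLED };

struct model_ohci {
	uint32_t intrstatus;          /* pending interrupt causes */
	uint32_t intrenable;          /* enabled interrupt causes */
	enum model_rh_state rh_state; /* root hub state */
};

/* Mirrors only the early-exit check from the commit message. */
enum model_irqreturn model_ohci_irq(struct model_ohci *ohci)
{
	uint32_t ints = ohci->intrstatus & ohci->intrenable;

	/* interrupt for some other device? */
	if (ints == 0 || ohci->rh_state == MODEL_RH_HALTED)
		return MODEL_IRQ_NOTMINE;

	ohci->intrstatus &= ~ints;    /* acknowledge what was handled */
	return MODEL_IRQ_HANDLED;
}

/* True while the modelled device would still assert its interrupt line. */
bool model_line_asserted(const struct model_ohci *ohci)
{
	uint32_t pending = ohci->intrstatus & ohci->intrenable & ~MODEL_INTR_MIE;

	return (ohci->intrenable & MODEL_INTR_MIE) != 0 && pending != 0;
}
```

With `intrenable` set to `MODEL_INTR_MIE | MODEL_INTR_SF`, a pending `MODEL_INTR_SF`, and `rh_state` set to `MODEL_RH_HALTED`, `model_ohci_irq()` returns `MODEL_IRQ_NOTMINE` and `model_line_asserted()` stays true, so the interrupt would fire again with no handler claiming it.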

To fix the issue, ohci_shutdown() now holds the spin lock while
disabling interrupts and changing the rh_state flag, which prevents
OHCI_INTR_MIE from being re-enabled unexpectedly. Note that
io_watchdog_func() also calls ohci_shutdown() while already holding
the spin lock, so the patch adds a new function, _ohci_shutdown().
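
The locked/unlocked split described above can be sketched in userspace as follows. This is an illustrative model under stated assumptions, not the actual patch: a `pthread_mutex_t` stands in for the driver's spin lock, all `model_` names are hypothetical, and the interrupt path is assumed to re-enable interrupts only under the same lock and only while the state is still running.

```c
#include <pthread.h>
#include <stdint.h>

enum model_hc_state { MODEL_HC_RUNNING, MODEL_HC_HALTED };

struct model_hcd {
	pthread_mutex_t lock;      /* stands in for the driver's spin lock */
	uint32_t intrenable;       /* modelled interrupt enable register */
	enum model_hc_state state; /* modelled root hub state */
};

/* Unlocked helper: the caller must already hold hcd->lock. */
void model_shutdown_locked(struct model_hcd *hcd)
{
	hcd->intrenable = 0;
	hcd->state = MODEL_HC_HALTED;
}

/* Entry point for callers that do not hold the lock. */
void model_shutdown(struct model_hcd *hcd)
{
	pthread_mutex_lock(&hcd->lock);
	model_shutdown_locked(hcd);
	pthread_mutex_unlock(&hcd->lock);
}

/*
 * Watchdog-like caller: it already holds the lock for its own work, so it
 * must use the unlocked helper; calling model_shutdown() here would try to
 * take a non-recursive lock twice.
 */
void model_watchdog(struct model_hcd *hcd)
{
	pthread_mutex_lock(&hcd->lock);
	model_shutdown_locked(hcd);
	pthread_mutex_unlock(&hcd->lock);
}

/* Interrupt-like path: re-enables bits only while still running. */
int model_irq_reenable(struct model_hcd *hcd, uint32_t bits)
{
	int enabled = 0;

	pthread_mutex_lock(&hcd->lock);
	if (hcd->state == MODEL_HC_RUNNING) {
		hcd->intrenable |= bits;
		enabled = 1;
	}
	pthread_mutex_unlock(&hcd->lock);
	return enabled;
}
```

Because the disable and the state change happen under one lock acquisition, the modelled interrupt path sees either the running state with interrupts still enabled or the halted state with everything disabled, never the mixed state from the error report.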

This patch is inspired by a Renesas R-Car Gen3 BSP patch
from Tho Vu.

Signed-off-by: Yoshihiro Shimoda <[email protected]>
Cc: stable <[email protected]>
Acked-by: Alan Stern <[email protected]>
Link: https://lore.kernel.org/r/1566877910-6020-1-git-send-email-yoshihiro.shimoda.uh@renesas.com
Signed-off-by: Greg Kroah-Hartman <[email protected]>
damentz referenced this pull request in zen-kernel/zen-kernel Sep 6, 2019
commit a349b95 upstream.

mrchapp pushed a commit to mrchapp/linux that referenced this pull request Sep 6, 2019
commit a349b95 upstream.

mrchapp pushed a commit to mrchapp/linux that referenced this pull request Sep 6, 2019
commit a349b95 upstream.

This patch fixes an issue that the following error is
possible to happen when ohci hardware causes an interruption
and the system is shutting down at the same time.

[   34.851754] usb 2-1: USB disconnect, device number 2
[   35.166658] irq 156: nobody cared (try booting with the "irqpoll" option)
[   35.173445] CPU: 0 PID: 22 Comm: kworker/0:1 Not tainted 5.3.0-rc5 torvalds#85
[   35.179964] Hardware name: Renesas Salvator-X 2nd version board based on r8a77965 (DT)
[   35.187886] Workqueue: usb_hub_wq hub_event
[   35.192063] Call trace:
[   35.194509]  dump_backtrace+0x0/0x150
[   35.198165]  show_stack+0x14/0x20
[   35.201475]  dump_stack+0xa0/0xc4
[   35.204785]  __report_bad_irq+0x34/0xe8
[   35.208614]  note_interrupt+0x2cc/0x318
[   35.212446]  handle_irq_event_percpu+0x5c/0x88
[   35.216883]  handle_irq_event+0x48/0x78
[   35.220712]  handle_fasteoi_irq+0xb4/0x188
[   35.224802]  generic_handle_irq+0x24/0x38
[   35.228804]  __handle_domain_irq+0x5c/0xb0
[   35.232893]  gic_handle_irq+0x58/0xa8
[   35.236548]  el1_irq+0xb8/0x180
[   35.239681]  __do_softirq+0x94/0x23c
[   35.243253]  irq_exit+0xd0/0xd8
[   35.246387]  __handle_domain_irq+0x60/0xb0
[   35.250475]  gic_handle_irq+0x58/0xa8
[   35.254130]  el1_irq+0xb8/0x180
[   35.257268]  kernfs_find_ns+0x5c/0x120
[   35.261010]  kernfs_find_and_get_ns+0x3c/0x60
[   35.265361]  sysfs_unmerge_group+0x20/0x68
[   35.269454]  dpm_sysfs_remove+0x2c/0x68
[   35.273284]  device_del+0x80/0x370
[   35.276683]  hid_destroy_device+0x28/0x60
[   35.280686]  usbhid_disconnect+0x4c/0x80
[   35.284602]  usb_unbind_interface+0x6c/0x268
[   35.288867]  device_release_driver_internal+0xe4/0x1b0
[   35.293998]  device_release_driver+0x14/0x20
[   35.298261]  bus_remove_device+0x110/0x128
[   35.302350]  device_del+0x148/0x370
[   35.305832]  usb_disable_device+0x8c/0x1d0
[   35.309921]  usb_disconnect+0xc8/0x2d0
[   35.313663]  hub_event+0x6e0/0x1128
[   35.317146]  process_one_work+0x1e0/0x320
[   35.321148]  worker_thread+0x40/0x450
[   35.324805]  kthread+0x124/0x128
[   35.328027]  ret_from_fork+0x10/0x18
[   35.331594] handlers:
[   35.333862] [<0000000079300c1d>] usb_hcd_irq
[   35.338126] [<0000000079300c1d>] usb_hcd_irq
[   35.342389] Disabling IRQ torvalds#156

ohci_shutdown() disables all the interrupt and rh_state is set to
OHCI_RH_HALTED. In other hand, ohci_irq() is possible to enable
OHCI_INTR_SF and OHCI_INTR_MIE on ohci_irq(). Note that OHCI_INTR_SF
is possible to be set by start_ed_unlink() which is called:
 ohci_irq()
  -> process_done_list()
   -> takeback_td()
    -> start_ed_unlink()

So, ohci_irq() has the following condition, the issue happens by
&ohci->regs->intrenable = OHCI_INTR_MIE | OHCI_INTR_SF and
ohci->rh_state = OHCI_RH_HALTED:

	/* interrupt for some other device? */
	if (ints == 0 || unlikely(ohci->rh_state == OHCI_RH_HALTED))
		return IRQ_NOTMINE;

To fix the issue, ohci_shutdown() holds the spin lock while disabling
the interruption and changing the rh_state flag to prevent reenable
the OHCI_INTR_MIE unexpectedly. Note that io_watchdog_func() also
calls the ohci_shutdown() and it already held the spin lock, so that
the patch makes a new function as _ohci_shutdown().

This patch is inspired by a Renesas R-Car Gen3 BSP patch
from Tho Vu.

Signed-off-by: Yoshihiro Shimoda <[email protected]>
Cc: stable <[email protected]>
Acked-by: Alan Stern <[email protected]>
Link: https://lore.kernel.org/r/1566877910-6020-1-git-send-email-yoshihiro.shimoda.uh@renesas.com
Signed-off-by: Greg Kroah-Hartman <[email protected]>
mrchapp pushed a commit to mrchapp/linux that referenced this pull request Sep 6, 2019
commit a349b95 upstream.

This patch fixes an issue that the following error is
possible to happen when ohci hardware causes an interruption
and the system is shutting down at the same time.

[   34.851754] usb 2-1: USB disconnect, device number 2
[   35.166658] irq 156: nobody cared (try booting with the "irqpoll" option)
[   35.173445] CPU: 0 PID: 22 Comm: kworker/0:1 Not tainted 5.3.0-rc5 torvalds#85
[   35.179964] Hardware name: Renesas Salvator-X 2nd version board based on r8a77965 (DT)
[   35.187886] Workqueue: usb_hub_wq hub_event
[   35.192063] Call trace:
[   35.194509]  dump_backtrace+0x0/0x150
[   35.198165]  show_stack+0x14/0x20
[   35.201475]  dump_stack+0xa0/0xc4
[   35.204785]  __report_bad_irq+0x34/0xe8
[   35.208614]  note_interrupt+0x2cc/0x318
[   35.212446]  handle_irq_event_percpu+0x5c/0x88
[   35.216883]  handle_irq_event+0x48/0x78
[   35.220712]  handle_fasteoi_irq+0xb4/0x188
[   35.224802]  generic_handle_irq+0x24/0x38
[   35.228804]  __handle_domain_irq+0x5c/0xb0
[   35.232893]  gic_handle_irq+0x58/0xa8
[   35.236548]  el1_irq+0xb8/0x180
[   35.239681]  __do_softirq+0x94/0x23c
[   35.243253]  irq_exit+0xd0/0xd8
[   35.246387]  __handle_domain_irq+0x60/0xb0
[   35.250475]  gic_handle_irq+0x58/0xa8
[   35.254130]  el1_irq+0xb8/0x180
[   35.257268]  kernfs_find_ns+0x5c/0x120
[   35.261010]  kernfs_find_and_get_ns+0x3c/0x60
[   35.265361]  sysfs_unmerge_group+0x20/0x68
[   35.269454]  dpm_sysfs_remove+0x2c/0x68
[   35.273284]  device_del+0x80/0x370
[   35.276683]  hid_destroy_device+0x28/0x60
[   35.280686]  usbhid_disconnect+0x4c/0x80
[   35.284602]  usb_unbind_interface+0x6c/0x268
[   35.288867]  device_release_driver_internal+0xe4/0x1b0
[   35.293998]  device_release_driver+0x14/0x20
[   35.298261]  bus_remove_device+0x110/0x128
[   35.302350]  device_del+0x148/0x370
[   35.305832]  usb_disable_device+0x8c/0x1d0
[   35.309921]  usb_disconnect+0xc8/0x2d0
[   35.313663]  hub_event+0x6e0/0x1128
[   35.317146]  process_one_work+0x1e0/0x320
[   35.321148]  worker_thread+0x40/0x450
[   35.324805]  kthread+0x124/0x128
[   35.328027]  ret_from_fork+0x10/0x18
[   35.331594] handlers:
[   35.333862] [<0000000079300c1d>] usb_hcd_irq
[   35.338126] [<0000000079300c1d>] usb_hcd_irq
[   35.342389] Disabling IRQ torvalds#156

ohci_shutdown() disables all the interrupt and rh_state is set to
OHCI_RH_HALTED. In other hand, ohci_irq() is possible to enable
OHCI_INTR_SF and OHCI_INTR_MIE on ohci_irq(). Note that OHCI_INTR_SF
is possible to be set by start_ed_unlink() which is called:
 ohci_irq()
  -> process_done_list()
   -> takeback_td()
    -> start_ed_unlink()

So, ohci_irq() has the following condition, the issue happens by
&ohci->regs->intrenable = OHCI_INTR_MIE | OHCI_INTR_SF and
ohci->rh_state = OHCI_RH_HALTED:

	/* interrupt for some other device? */
	if (ints == 0 || unlikely(ohci->rh_state == OHCI_RH_HALTED))
		return IRQ_NOTMINE;

To fix the issue, ohci_shutdown() holds the spin lock while disabling
the interruption and changing the rh_state flag to prevent reenable
the OHCI_INTR_MIE unexpectedly. Note that io_watchdog_func() also
calls the ohci_shutdown() and it already held the spin lock, so that
the patch makes a new function as _ohci_shutdown().

This patch is inspired by a Renesas R-Car Gen3 BSP patch
from Tho Vu.

Signed-off-by: Yoshihiro Shimoda <[email protected]>
Cc: stable <[email protected]>
Acked-by: Alan Stern <[email protected]>
Link: https://lore.kernel.org/r/1566877910-6020-1-git-send-email-yoshihiro.shimoda.uh@renesas.com
Signed-off-by: Greg Kroah-Hartman <[email protected]>
mrchapp pushed a commit to mrchapp/linux that referenced this pull request Sep 6, 2019
commit a349b95 upstream.

This patch fixes an issue that the following error is
possible to happen when ohci hardware causes an interruption
and the system is shutting down at the same time.

[   34.851754] usb 2-1: USB disconnect, device number 2
[   35.166658] irq 156: nobody cared (try booting with the "irqpoll" option)
[   35.173445] CPU: 0 PID: 22 Comm: kworker/0:1 Not tainted 5.3.0-rc5 torvalds#85
[   35.179964] Hardware name: Renesas Salvator-X 2nd version board based on r8a77965 (DT)
[   35.187886] Workqueue: usb_hub_wq hub_event
[   35.192063] Call trace:
[   35.194509]  dump_backtrace+0x0/0x150
[   35.198165]  show_stack+0x14/0x20
[   35.201475]  dump_stack+0xa0/0xc4
[   35.204785]  __report_bad_irq+0x34/0xe8
[   35.208614]  note_interrupt+0x2cc/0x318
[   35.212446]  handle_irq_event_percpu+0x5c/0x88
[   35.216883]  handle_irq_event+0x48/0x78
[   35.220712]  handle_fasteoi_irq+0xb4/0x188
[   35.224802]  generic_handle_irq+0x24/0x38
[   35.228804]  __handle_domain_irq+0x5c/0xb0
[   35.232893]  gic_handle_irq+0x58/0xa8
[   35.236548]  el1_irq+0xb8/0x180
[   35.239681]  __do_softirq+0x94/0x23c
[   35.243253]  irq_exit+0xd0/0xd8
[   35.246387]  __handle_domain_irq+0x60/0xb0
[   35.250475]  gic_handle_irq+0x58/0xa8
[   35.254130]  el1_irq+0xb8/0x180
[   35.257268]  kernfs_find_ns+0x5c/0x120
[   35.261010]  kernfs_find_and_get_ns+0x3c/0x60
[   35.265361]  sysfs_unmerge_group+0x20/0x68
[   35.269454]  dpm_sysfs_remove+0x2c/0x68
[   35.273284]  device_del+0x80/0x370
[   35.276683]  hid_destroy_device+0x28/0x60
[   35.280686]  usbhid_disconnect+0x4c/0x80
[   35.284602]  usb_unbind_interface+0x6c/0x268
[   35.288867]  device_release_driver_internal+0xe4/0x1b0
[   35.293998]  device_release_driver+0x14/0x20
[   35.298261]  bus_remove_device+0x110/0x128
[   35.302350]  device_del+0x148/0x370
[   35.305832]  usb_disable_device+0x8c/0x1d0
[   35.309921]  usb_disconnect+0xc8/0x2d0
[   35.313663]  hub_event+0x6e0/0x1128
[   35.317146]  process_one_work+0x1e0/0x320
[   35.321148]  worker_thread+0x40/0x450
[   35.324805]  kthread+0x124/0x128
[   35.328027]  ret_from_fork+0x10/0x18
[   35.331594] handlers:
[   35.333862] [<0000000079300c1d>] usb_hcd_irq
[   35.338126] [<0000000079300c1d>] usb_hcd_irq
[   35.342389] Disabling IRQ #156

ohci_shutdown() disables all interrupts and sets rh_state to
OHCI_RH_HALTED. On the other hand, ohci_irq() can enable
OHCI_INTR_SF and OHCI_INTR_MIE. Note that OHCI_INTR_SF can be
set by start_ed_unlink(), which is reached via:
 ohci_irq()
  -> process_done_list()
   -> takeback_td()
    -> start_ed_unlink()

ohci_irq() contains the following check. The issue occurs when
&ohci->regs->intrenable is OHCI_INTR_MIE | OHCI_INTR_SF while
ohci->rh_state is OHCI_RH_HALTED, because the handler then disowns
an interrupt that the controller keeps raising:

	/* interrupt for some other device? */
	if (ints == 0 || unlikely(ohci->rh_state == OHCI_RH_HALTED))
		return IRQ_NOTMINE;

To fix the issue, ohci_shutdown() holds the spin lock while disabling
the interrupts and changing the rh_state flag, which prevents
OHCI_INTR_MIE from being re-enabled unexpectedly. Note that
io_watchdog_func() also calls ohci_shutdown() while already holding
the spin lock, so the patch adds a new function, _ohci_shutdown().
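
The locking split described above can be sketched in user-space C. This is
only a toy model under stated assumptions, not the kernel code: struct
toy_hcd, toy_lock(), toy_unlock() and the TOY_* constants are hypothetical
stand-ins for struct ohci_hcd, the spin lock and the OHCI register bits, and
the flag-based lock reports a double acquisition instead of spinning.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver state; not the real struct ohci_hcd. */
enum toy_rh_state { TOY_RH_RUNNING, TOY_RH_HALTED };

#define TOY_INTR_MIE 0x80000000u /* illustrative "master interrupt enable" bit */
#define TOY_INTR_SF  0x00000004u /* illustrative "start of frame" bit */

struct toy_hcd {
	bool lock_held;          /* models a non-recursive lock */
	unsigned int intrenable; /* models the interrupt-enable register */
	enum toy_rh_state rh_state;
};

/* Returns false if the lock is already held; a real spin lock would hang. */
static bool toy_lock(struct toy_hcd *hcd)
{
	if (hcd->lock_held)
		return false;
	hcd->lock_held = true;
	return true;
}

static void toy_unlock(struct toy_hcd *hcd)
{
	hcd->lock_held = false;
}

/* Mirrors the quoted check: a halted controller disowns the interrupt. */
static bool toy_irq_is_mine(const struct toy_hcd *hcd, unsigned int ints)
{
	return !(ints == 0 || hcd->rh_state == TOY_RH_HALTED);
}

/* Unlocked variant: the caller must already hold the lock. */
static void _toy_shutdown(struct toy_hcd *hcd)
{
	hcd->intrenable = 0;
	hcd->rh_state = TOY_RH_HALTED;
}

/* Locked variant: both updates happen under the lock, as one step. */
static bool toy_shutdown(struct toy_hcd *hcd)
{
	if (!toy_lock(hcd))
		return false;
	_toy_shutdown(hcd);
	toy_unlock(hcd);
	return true;
}

/* Models a watchdog path that takes the lock before shutting down. */
static bool toy_watchdog_died(struct toy_hcd *hcd)
{
	if (!toy_lock(hcd))
		return false;
	_toy_shutdown(hcd); /* toy_shutdown() here would try to re-take the lock */
	toy_unlock(hcd);
	return true;
}
```

In this sketch, calling toy_shutdown() while the lock is held fails, which is
the situation the separate unlocked helper is meant to avoid.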

This patch is inspired by a Renesas R-Car Gen3 BSP patch
from Tho Vu.

Signed-off-by: Yoshihiro Shimoda <[email protected]>
Cc: stable <[email protected]>
Acked-by: Alan Stern <[email protected]>
Link: https://lore.kernel.org/r/1566877910-6020-1-git-send-email-yoshihiro.shimoda.uh@renesas.com
Signed-off-by: Greg Kroah-Hartman <[email protected]>
otyshchenko1 referenced this pull request in otyshchenko1/linux Oct 2, 2019
…shutdown

This patch fixes the 'irq nobody cared' issue during reboot.

How to reproduce:
1) Prepare a Weston-enabled environment
2) Connect a USB mouse
3) Read input from the mouse and reboot:
$ od -tx /dev/input/event0 &
$ reboot
4) Move the mouse while the system is shutting down
(there is no need to move the mouse after "reboot: Restarting system")
5) Repeat steps 3 and 4 until the error below occurs

Error log:
usb 2-1: USB disconnect, device number 2
irq 156: nobody cared (try booting with the "irqpoll" option)
Workqueue: usb_hub_wq hub_event
Call trace:
...
usbhid_disconnect+0x4c/0x78
usb_unbind_interface+0x6c/0x2a8
device_release_driver_internal+0x174/0x208
device_release_driver+0x14/0x20
bus_remove_device+0x114/0x128
device_del+0x1ac/0x300
usb_disable_device+0x8c/0x200
usb_disconnect+0xb4/0x218
...
handlers:
usb_hcd_irq
Disabling IRQ #156

This issue occurs due to a race condition between the ohci_irq()
interrupt handler and ohci_shutdown().
Taking the lock with spin_lock_irq() prevents interrupts from being
raised while OHCI is shutting down, which fixes this issue.

When the host controller dies, the lock is already held by
io_watchdog_func() before ohci_shutdown() is called, so locking must be
skipped in this case to prevent a deadlock.

Signed-off-by: Tho Vu <[email protected]>
Noltari pushed a commit to Noltari/linux that referenced this pull request Nov 23, 2019
commit a349b95 upstream.

This patch fixes an issue where the following error can occur when
the OHCI hardware raises an interrupt while the system is shutting
down.

[   34.851754] usb 2-1: USB disconnect, device number 2
[   35.166658] irq 156: nobody cared (try booting with the "irqpoll" option)
[   35.173445] CPU: 0 PID: 22 Comm: kworker/0:1 Not tainted 5.3.0-rc5 #85
[   35.179964] Hardware name: Renesas Salvator-X 2nd version board based on r8a77965 (DT)
[   35.187886] Workqueue: usb_hub_wq hub_event
[   35.192063] Call trace:
[   35.194509]  dump_backtrace+0x0/0x150
[   35.198165]  show_stack+0x14/0x20
[   35.201475]  dump_stack+0xa0/0xc4
[   35.204785]  __report_bad_irq+0x34/0xe8
[   35.208614]  note_interrupt+0x2cc/0x318
[   35.212446]  handle_irq_event_percpu+0x5c/0x88
[   35.216883]  handle_irq_event+0x48/0x78
[   35.220712]  handle_fasteoi_irq+0xb4/0x188
[   35.224802]  generic_handle_irq+0x24/0x38
[   35.228804]  __handle_domain_irq+0x5c/0xb0
[   35.232893]  gic_handle_irq+0x58/0xa8
[   35.236548]  el1_irq+0xb8/0x180
[   35.239681]  __do_softirq+0x94/0x23c
[   35.243253]  irq_exit+0xd0/0xd8
[   35.246387]  __handle_domain_irq+0x60/0xb0
[   35.250475]  gic_handle_irq+0x58/0xa8
[   35.254130]  el1_irq+0xb8/0x180
[   35.257268]  kernfs_find_ns+0x5c/0x120
[   35.261010]  kernfs_find_and_get_ns+0x3c/0x60
[   35.265361]  sysfs_unmerge_group+0x20/0x68
[   35.269454]  dpm_sysfs_remove+0x2c/0x68
[   35.273284]  device_del+0x80/0x370
[   35.276683]  hid_destroy_device+0x28/0x60
[   35.280686]  usbhid_disconnect+0x4c/0x80
[   35.284602]  usb_unbind_interface+0x6c/0x268
[   35.288867]  device_release_driver_internal+0xe4/0x1b0
[   35.293998]  device_release_driver+0x14/0x20
[   35.298261]  bus_remove_device+0x110/0x128
[   35.302350]  device_del+0x148/0x370
[   35.305832]  usb_disable_device+0x8c/0x1d0
[   35.309921]  usb_disconnect+0xc8/0x2d0
[   35.313663]  hub_event+0x6e0/0x1128
[   35.317146]  process_one_work+0x1e0/0x320
[   35.321148]  worker_thread+0x40/0x450
[   35.324805]  kthread+0x124/0x128
[   35.328027]  ret_from_fork+0x10/0x18
[   35.331594] handlers:
[   35.333862] [<0000000079300c1d>] usb_hcd_irq
[   35.338126] [<0000000079300c1d>] usb_hcd_irq
[   35.342389] Disabling IRQ #156

ohci_shutdown() disables all interrupts and sets rh_state to
OHCI_RH_HALTED. On the other hand, ohci_irq() can enable
OHCI_INTR_SF and OHCI_INTR_MIE. Note that OHCI_INTR_SF can be
set by start_ed_unlink(), which is reached via:
 ohci_irq()
  -> process_done_list()
   -> takeback_td()
    -> start_ed_unlink()

ohci_irq() contains the following check. The issue occurs when
&ohci->regs->intrenable is OHCI_INTR_MIE | OHCI_INTR_SF while
ohci->rh_state is OHCI_RH_HALTED, because the handler then disowns
an interrupt that the controller keeps raising:

	/* interrupt for some other device? */
	if (ints == 0 || unlikely(ohci->rh_state == OHCI_RH_HALTED))
		return IRQ_NOTMINE;

To fix the issue, ohci_shutdown() holds the spin lock while disabling
the interrupts and changing the rh_state flag, which prevents
OHCI_INTR_MIE from being re-enabled unexpectedly. Note that
io_watchdog_func() also calls ohci_shutdown() while already holding
the spin lock, so the patch adds a new function, _ohci_shutdown().

This patch is inspired by a Renesas R-Car Gen3 BSP patch
from Tho Vu.

Signed-off-by: Yoshihiro Shimoda <[email protected]>
Acked-by: Alan Stern <[email protected]>
Link: https://lore.kernel.org/r/1566877910-6020-1-git-send-email-yoshihiro.shimoda.uh@renesas.com
Signed-off-by: Greg Kroah-Hartman <[email protected]>
[bwh: Backported to 3.16:
 - Drop change in io_watchdog_func()
 - Adjust context]
Signed-off-by: Ben Hutchings <[email protected]>
chewitt pushed a commit to chewitt/linux that referenced this pull request Dec 2, 2019
commit a349b95 upstream.

This patch fixes an issue where the following error can occur when
the OHCI hardware raises an interrupt while the system is shutting
down.

[   34.851754] usb 2-1: USB disconnect, device number 2
[   35.166658] irq 156: nobody cared (try booting with the "irqpoll" option)
[   35.173445] CPU: 0 PID: 22 Comm: kworker/0:1 Not tainted 5.3.0-rc5 #85
[   35.179964] Hardware name: Renesas Salvator-X 2nd version board based on r8a77965 (DT)
[   35.187886] Workqueue: usb_hub_wq hub_event
[   35.192063] Call trace:
[   35.194509]  dump_backtrace+0x0/0x150
[   35.198165]  show_stack+0x14/0x20
[   35.201475]  dump_stack+0xa0/0xc4
[   35.204785]  __report_bad_irq+0x34/0xe8
[   35.208614]  note_interrupt+0x2cc/0x318
[   35.212446]  handle_irq_event_percpu+0x5c/0x88
[   35.216883]  handle_irq_event+0x48/0x78
[   35.220712]  handle_fasteoi_irq+0xb4/0x188
[   35.224802]  generic_handle_irq+0x24/0x38
[   35.228804]  __handle_domain_irq+0x5c/0xb0
[   35.232893]  gic_handle_irq+0x58/0xa8
[   35.236548]  el1_irq+0xb8/0x180
[   35.239681]  __do_softirq+0x94/0x23c
[   35.243253]  irq_exit+0xd0/0xd8
[   35.246387]  __handle_domain_irq+0x60/0xb0
[   35.250475]  gic_handle_irq+0x58/0xa8
[   35.254130]  el1_irq+0xb8/0x180
[   35.257268]  kernfs_find_ns+0x5c/0x120
[   35.261010]  kernfs_find_and_get_ns+0x3c/0x60
[   35.265361]  sysfs_unmerge_group+0x20/0x68
[   35.269454]  dpm_sysfs_remove+0x2c/0x68
[   35.273284]  device_del+0x80/0x370
[   35.276683]  hid_destroy_device+0x28/0x60
[   35.280686]  usbhid_disconnect+0x4c/0x80
[   35.284602]  usb_unbind_interface+0x6c/0x268
[   35.288867]  device_release_driver_internal+0xe4/0x1b0
[   35.293998]  device_release_driver+0x14/0x20
[   35.298261]  bus_remove_device+0x110/0x128
[   35.302350]  device_del+0x148/0x370
[   35.305832]  usb_disable_device+0x8c/0x1d0
[   35.309921]  usb_disconnect+0xc8/0x2d0
[   35.313663]  hub_event+0x6e0/0x1128
[   35.317146]  process_one_work+0x1e0/0x320
[   35.321148]  worker_thread+0x40/0x450
[   35.324805]  kthread+0x124/0x128
[   35.328027]  ret_from_fork+0x10/0x18
[   35.331594] handlers:
[   35.333862] [<0000000079300c1d>] usb_hcd_irq
[   35.338126] [<0000000079300c1d>] usb_hcd_irq
[   35.342389] Disabling IRQ #156

ohci_shutdown() disables all interrupts and sets rh_state to
OHCI_RH_HALTED. On the other hand, ohci_irq() can enable
OHCI_INTR_SF and OHCI_INTR_MIE. Note that OHCI_INTR_SF can be
set by start_ed_unlink(), which is reached via:
 ohci_irq()
  -> process_done_list()
   -> takeback_td()
    -> start_ed_unlink()

ohci_irq() contains the following check. The issue occurs when
&ohci->regs->intrenable is OHCI_INTR_MIE | OHCI_INTR_SF while
ohci->rh_state is OHCI_RH_HALTED, because the handler then disowns
an interrupt that the controller keeps raising:

	/* interrupt for some other device? */
	if (ints == 0 || unlikely(ohci->rh_state == OHCI_RH_HALTED))
		return IRQ_NOTMINE;

To fix the issue, ohci_shutdown() holds the spin lock while disabling
the interrupts and changing the rh_state flag, which prevents
OHCI_INTR_MIE from being re-enabled unexpectedly. Note that
io_watchdog_func() also calls ohci_shutdown() while already holding
the spin lock, so the patch adds a new function, _ohci_shutdown().

This patch is inspired by a Renesas R-Car Gen3 BSP patch
from Tho Vu.

Signed-off-by: Yoshihiro Shimoda <[email protected]>
Cc: stable <[email protected]>
Acked-by: Alan Stern <[email protected]>
Link: https://lore.kernel.org/r/1566877910-6020-1-git-send-email-yoshihiro.shimoda.uh@renesas.com
Signed-off-by: Greg Kroah-Hartman <[email protected]>
@erikterwiel erikterwiel left a comment

lol

AndrewSmart referenced this pull request in tgstation/tgstation Apr 17, 2020
Further changes to comply with the ToS

More ToS cleanup

Bye bye George Melons

Better to be safe than sorry
fengguang pushed a commit to 0day-ci/linux that referenced this pull request May 8, 2020
Fix the following checkpatch warnings and errors:

ERROR: do not initialise statics to 0
#14: FILE: drivers/platform/mips/cpu_hwmon.c:14:
+static int csr_temp_enable = 0;

WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'.
#60: FILE: drivers/platform/mips/cpu_hwmon.c:60:
+static SENSOR_DEVICE_ATTR(name, S_IRUGO, get_hwmon_name, NULL, 0);

WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'.
#84: FILE: drivers/platform/mips/cpu_hwmon.c:84:
+static SENSOR_DEVICE_ATTR(temp1_input, S_IRUGO, get_cpu_temp, NULL, 1);

WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'.
#85: FILE: drivers/platform/mips/cpu_hwmon.c:85:
+static SENSOR_DEVICE_ATTR(temp1_label, S_IRUGO, cpu_temp_label, NULL, 1);

WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'.
#86: FILE: drivers/platform/mips/cpu_hwmon.c:86:
+static SENSOR_DEVICE_ATTR(temp2_input, S_IRUGO, get_cpu_temp, NULL, 2);

WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'.
#87: FILE: drivers/platform/mips/cpu_hwmon.c:87:
+static SENSOR_DEVICE_ATTR(temp2_label, S_IRUGO, cpu_temp_label, NULL, 2);

WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'.
#88: FILE: drivers/platform/mips/cpu_hwmon.c:88:
+static SENSOR_DEVICE_ATTR(temp3_input, S_IRUGO, get_cpu_temp, NULL, 3);

WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'.
#89: FILE: drivers/platform/mips/cpu_hwmon.c:89:
+static SENSOR_DEVICE_ATTR(temp3_label, S_IRUGO, cpu_temp_label, NULL, 3);

WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'.
#90: FILE: drivers/platform/mips/cpu_hwmon.c:90:
+static SENSOR_DEVICE_ATTR(temp4_input, S_IRUGO, get_cpu_temp, NULL, 4);

WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'.
#91: FILE: drivers/platform/mips/cpu_hwmon.c:91:
+static SENSOR_DEVICE_ATTR(temp4_label, S_IRUGO, cpu_temp_label, NULL, 4);

WARNING: Missing a blank line after declarations
#120: FILE: drivers/platform/mips/cpu_hwmon.c:120:
+	int id = (to_sensor_dev_attr(attr))->index - 1;
+	return sprintf(buf, "CPU %d Temperature\n", id);

WARNING: Missing a blank line after declarations
#128: FILE: drivers/platform/mips/cpu_hwmon.c:128:
+	int value = loongson3_cpu_temp(id);
+	return sprintf(buf, "%d\n", value);

ERROR: spaces required around that '=' (ctx:VxV)
#135: FILE: drivers/platform/mips/cpu_hwmon.c:135:
+	for (i=0; i<nr_packages; i++)
 	      ^

ERROR: spaces required around that '<' (ctx:VxV)
#135: FILE: drivers/platform/mips/cpu_hwmon.c:135:
+	for (i=0; i<nr_packages; i++)
 	           ^

ERROR: spaces required around that '=' (ctx:VxV)
#145: FILE: drivers/platform/mips/cpu_hwmon.c:145:
+	for (i=0; i<nr_packages; i++)
 	      ^

ERROR: spaces required around that '<' (ctx:VxV)
#145: FILE: drivers/platform/mips/cpu_hwmon.c:145:
+	for (i=0; i<nr_packages; i++)
 	           ^

ERROR: spaces required around that '=' (ctx:VxV)
#156: FILE: drivers/platform/mips/cpu_hwmon.c:156:
+	for (i=0; i<nr_packages; i++) {
 	      ^

ERROR: spaces required around that '<' (ctx:VxV)
#156: FILE: drivers/platform/mips/cpu_hwmon.c:156:
+	for (i=0; i<nr_packages; i++) {
 	           ^

WARNING: line over 80 characters
#175: FILE: drivers/platform/mips/cpu_hwmon.c:175:
+		csr_temp_enable = csr_readl(LOONGSON_CSR_FEATURES) & LOONGSON_CSRF_TEMP;

total: 7 errors, 12 warnings, 231 lines checked
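
As a rough illustration of what the fixed forms look like, here is a small
user-space sketch. It is not taken from cpu_hwmon.c: TOY_S_IRUGO, temp_enable
and sum_packages() are hypothetical names, and TOY_S_IRUGO is rebuilt from the
POSIX read bits on the assumption that it matches the kernel's S_IRUGO.

```c
#include <sys/stat.h>

/* Assumed equivalent of the kernel's S_IRUGO, built from POSIX read bits. */
#define TOY_S_IRUGO (S_IRUSR | S_IRGRP | S_IROTH)

/* Statics are zero-initialised by the language, so "= 0" is redundant. */
static int temp_enable;

/* Sum per-package values, using the spacing and layout checkpatch accepts. */
static int sum_packages(const int *temps, int nr_packages)
{
	int total = 0;
	int i;

	for (i = 0; i < nr_packages; i++)
		total += temps[i];

	return total;
}
```

The blank line after the declarations and the spaces around '=' and '<' are
the same kinds of change the warnings above ask for; 0444 is the octal
spelling of the three read bits.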

Signed-off-by: Tiezhu Yang <[email protected]>
fengguang pushed a commit to 0day-ci/linux that referenced this pull request May 9, 2020
Fix the following checkpatch warnings and errors:

ERROR: do not initialise statics to 0
#14: FILE: drivers/platform/mips/cpu_hwmon.c:14:
+static int csr_temp_enable = 0;

WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'.
#60: FILE: drivers/platform/mips/cpu_hwmon.c:60:
+static SENSOR_DEVICE_ATTR(name, S_IRUGO, get_hwmon_name, NULL, 0);

WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'.
#84: FILE: drivers/platform/mips/cpu_hwmon.c:84:
+static SENSOR_DEVICE_ATTR(temp1_input, S_IRUGO, get_cpu_temp, NULL, 1);

WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'.
#85: FILE: drivers/platform/mips/cpu_hwmon.c:85:
+static SENSOR_DEVICE_ATTR(temp1_label, S_IRUGO, cpu_temp_label, NULL, 1);

WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'.
#86: FILE: drivers/platform/mips/cpu_hwmon.c:86:
+static SENSOR_DEVICE_ATTR(temp2_input, S_IRUGO, get_cpu_temp, NULL, 2);

WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'.
#87: FILE: drivers/platform/mips/cpu_hwmon.c:87:
+static SENSOR_DEVICE_ATTR(temp2_label, S_IRUGO, cpu_temp_label, NULL, 2);

WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'.
#88: FILE: drivers/platform/mips/cpu_hwmon.c:88:
+static SENSOR_DEVICE_ATTR(temp3_input, S_IRUGO, get_cpu_temp, NULL, 3);

WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'.
#89: FILE: drivers/platform/mips/cpu_hwmon.c:89:
+static SENSOR_DEVICE_ATTR(temp3_label, S_IRUGO, cpu_temp_label, NULL, 3);

WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'.
#90: FILE: drivers/platform/mips/cpu_hwmon.c:90:
+static SENSOR_DEVICE_ATTR(temp4_input, S_IRUGO, get_cpu_temp, NULL, 4);

WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'.
#91: FILE: drivers/platform/mips/cpu_hwmon.c:91:
+static SENSOR_DEVICE_ATTR(temp4_label, S_IRUGO, cpu_temp_label, NULL, 4);

WARNING: Missing a blank line after declarations
#120: FILE: drivers/platform/mips/cpu_hwmon.c:120:
+	int id = (to_sensor_dev_attr(attr))->index - 1;
+	return sprintf(buf, "CPU %d Temperature\n", id);

WARNING: Missing a blank line after declarations
#128: FILE: drivers/platform/mips/cpu_hwmon.c:128:
+	int value = loongson3_cpu_temp(id);
+	return sprintf(buf, "%d\n", value);

ERROR: spaces required around that '=' (ctx:VxV)
#135: FILE: drivers/platform/mips/cpu_hwmon.c:135:
+	for (i=0; i<nr_packages; i++)
 	      ^

ERROR: spaces required around that '<' (ctx:VxV)
#135: FILE: drivers/platform/mips/cpu_hwmon.c:135:
+	for (i=0; i<nr_packages; i++)
 	           ^

ERROR: spaces required around that '=' (ctx:VxV)
#145: FILE: drivers/platform/mips/cpu_hwmon.c:145:
+	for (i=0; i<nr_packages; i++)
 	      ^

ERROR: spaces required around that '<' (ctx:VxV)
#145: FILE: drivers/platform/mips/cpu_hwmon.c:145:
+	for (i=0; i<nr_packages; i++)
 	           ^

ERROR: spaces required around that '=' (ctx:VxV)
#156: FILE: drivers/platform/mips/cpu_hwmon.c:156:
+	for (i=0; i<nr_packages; i++) {
 	      ^

ERROR: spaces required around that '<' (ctx:VxV)
#156: FILE: drivers/platform/mips/cpu_hwmon.c:156:
+	for (i=0; i<nr_packages; i++) {
 	           ^

WARNING: line over 80 characters
#175: FILE: drivers/platform/mips/cpu_hwmon.c:175:
+		csr_temp_enable = csr_readl(LOONGSON_CSR_FEATURES) & LOONGSON_CSRF_TEMP;

Signed-off-by: Tiezhu Yang <[email protected]>
samueldr pushed a commit to samueldr/linux that referenced this pull request Jun 28, 2020
…a devs

commit 33f9e02 upstream.

Enabling the parport_pc driver on a B2600 (and probably other 64-bit
PARISC systems) produced the following BUG:

CPU: 0 PID: 1 Comm: swapper Not tainted 4.12.0-rc5-30198-g1132d5e #156
task: 000000009e050000 task.stack: 000000009e04c000

     YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
PSW: 00001000000001101111111100001111 Not tainted
r00-03  000000ff0806ff0f 000000009e04c990 0000000040871b78 000000009e04cac0
r04-07  0000000040c14de0 ffffffffffffffff 000000009e07f098 000000009d82d200
r08-11  000000009d82d210 0000000000000378 0000000000000000 0000000040c345e0
r12-15  0000000000000005 0000000040c345e0 0000000000000000 0000000040c9d5e0
r16-19  0000000040c345e0 00000000f00001c4 00000000f00001bc 0000000000000061
r20-23  000000009e04ce28 0000000000000010 0000000000000010 0000000040b89e40
r24-27  0000000000000003 0000000000ffffff 000000009d82d210 0000000040c14de0
r28-31  0000000000000000 000000009e04ca90 000000009e04cb40 0000000000000000
sr00-03  0000000000000000 0000000000000000 0000000000000000 0000000000000000
sr04-07  0000000000000000 0000000000000000 0000000000000000 0000000000000000

IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000404aece0 00000000404aece4
 IIR: 03ffe01f    ISR: 0000000010340000  IOR: 000001781304cac8
 CPU:        0   CR30: 000000009e04c000 CR31: 00000000e2976de2
 ORIG_R28: 0000000000000200
 IAOQ[0]: sba_dma_supported+0x80/0xd0
 IAOQ[1]: sba_dma_supported+0x84/0xd0
 RP(r2): parport_pc_probe_port+0x178/0x1200

The cause is a call to dma_coerce_mask_and_coherent() in
parport_pc_probe_port(), which the PARISC DMA API doesn't handle well.
This commit returns DMA_ERROR_CODE from DMA API calls if the device
isn't capable of DMA transactions.

Signed-off-by: Thomas Bogendoerfer <[email protected]>
Signed-off-by: Helge Deller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
chombourger pushed a commit to chombourger/linux that referenced this pull request Feb 16, 2021
…from plsdk-2729-v6 to processor-sdk-linux-4.19.y

* commit '8bb5766b8f065c3142ec2c6ae5bfdb82ce175d17':
  net: phy: dp83867: add workaround for incorrect default FLD threshold
  net: ethernet: ti: icssg_prueth: disable ptp for dual icssg
  arm64: dts: k3-am654-idk: add an overlay to support interposer card
  net: ti: icssg_prueth: Enhance the driver to support an icssg pair
  dt-bindings: net: ti,icssg-prueth: update for using icssg pair
Sylfrena pushed a commit to Sylfrena/linux that referenced this pull request Apr 4, 2021
PowerPC: GCC, docs, release config, more CI, avoid recompilation of `rust/`...
roxell pushed a commit to roxell/linux that referenced this pull request Apr 27, 2021
Fixes a checkpatch warning:

  WARNING: Possible comma where semicolon could be used
  #156: FILE: drivers/reset/sti/reset-syscfg.c:156:
  +	rc->rst.ops = &syscfg_reset_ops,
  +	rc->rst.of_node = dev->of_node;

Signed-off-by: Philipp Zabel <[email protected]>
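
For context on why checkpatch flags this: in the two assignments quoted above
the comma operator behaves the same as a semicolon, but a comma silently joins
expressions into one statement, which matters under an unbraced conditional.
A minimal sketch, with hypothetical function names:

```c
/* Both assignments belong to the if: the comma makes them one statement. */
static int with_comma(int cond)
{
	int a = 0, b = 0;

	if (cond)
		a = 1, b = 2;

	return a * 10 + b;
}

/* Only the first assignment is conditional; the second always runs. */
static int with_semicolon(int cond)
{
	int a = 0, b = 0;

	if (cond)
		a = 1;
	b = 2;

	return a * 10 + b;
}
```

The two functions differ only when cond is zero, which is the kind of
difference that is easy to miss in review when a comma was typed by accident.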
fengguang pushed a commit to 0day-ci/linux that referenced this pull request May 11, 2021
Fixes a checkpatch warning:

  WARNING: Possible comma where semicolon could be used
  #156: FILE: drivers/reset/sti/reset-syscfg.c:156:
  +	rc->rst.ops = &syscfg_reset_ops,
  +	rc->rst.of_node = dev->of_node;

Signed-off-by: Philipp Zabel <[email protected]>
akiernan pushed a commit to zuma-array/linux that referenced this pull request Nov 3, 2022
PD#150071: hdmitx: driver defect clean up:
#2
#156

Change-Id: Icf9d9d0cd112344d9981ed33171b04f744930808
Signed-off-by: Zongdong Jiao <[email protected]>
Signed-off-by: Yi Zhou <[email protected]>
akiernan pushed a commit to zuma-array/linux that referenced this pull request Nov 4, 2022
PD#150071: hdmitx: driver defect clean up:
#2
#156

Change-Id: Icf9d9d0cd112344d9981ed33171b04f744930808
Signed-off-by: Zongdong Jiao <[email protected]>
Signed-off-by: Yi Zhou <[email protected]>
Damenly pushed a commit to Damenly/linux that referenced this pull request Nov 13, 2022
commit a349b95 upstream.

This patch fixes an issue where the following error can occur when
the OHCI hardware raises an interrupt while the system is shutting
down.

[   34.851754] usb 2-1: USB disconnect, device number 2
[   35.166658] irq 156: nobody cared (try booting with the "irqpoll" option)
[   35.173445] CPU: 0 PID: 22 Comm: kworker/0:1 Not tainted 5.3.0-rc5 #85
[   35.179964] Hardware name: Renesas Salvator-X 2nd version board based on r8a77965 (DT)
[   35.187886] Workqueue: usb_hub_wq hub_event
[   35.192063] Call trace:
[   35.194509]  dump_backtrace+0x0/0x150
[   35.198165]  show_stack+0x14/0x20
[   35.201475]  dump_stack+0xa0/0xc4
[   35.204785]  __report_bad_irq+0x34/0xe8
[   35.208614]  note_interrupt+0x2cc/0x318
[   35.212446]  handle_irq_event_percpu+0x5c/0x88
[   35.216883]  handle_irq_event+0x48/0x78
[   35.220712]  handle_fasteoi_irq+0xb4/0x188
[   35.224802]  generic_handle_irq+0x24/0x38
[   35.228804]  __handle_domain_irq+0x5c/0xb0
[   35.232893]  gic_handle_irq+0x58/0xa8
[   35.236548]  el1_irq+0xb8/0x180
[   35.239681]  __do_softirq+0x94/0x23c
[   35.243253]  irq_exit+0xd0/0xd8
[   35.246387]  __handle_domain_irq+0x60/0xb0
[   35.250475]  gic_handle_irq+0x58/0xa8
[   35.254130]  el1_irq+0xb8/0x180
[   35.257268]  kernfs_find_ns+0x5c/0x120
[   35.261010]  kernfs_find_and_get_ns+0x3c/0x60
[   35.265361]  sysfs_unmerge_group+0x20/0x68
[   35.269454]  dpm_sysfs_remove+0x2c/0x68
[   35.273284]  device_del+0x80/0x370
[   35.276683]  hid_destroy_device+0x28/0x60
[   35.280686]  usbhid_disconnect+0x4c/0x80
[   35.284602]  usb_unbind_interface+0x6c/0x268
[   35.288867]  device_release_driver_internal+0xe4/0x1b0
[   35.293998]  device_release_driver+0x14/0x20
[   35.298261]  bus_remove_device+0x110/0x128
[   35.302350]  device_del+0x148/0x370
[   35.305832]  usb_disable_device+0x8c/0x1d0
[   35.309921]  usb_disconnect+0xc8/0x2d0
[   35.313663]  hub_event+0x6e0/0x1128
[   35.317146]  process_one_work+0x1e0/0x320
[   35.321148]  worker_thread+0x40/0x450
[   35.324805]  kthread+0x124/0x128
[   35.328027]  ret_from_fork+0x10/0x18
[   35.331594] handlers:
[   35.333862] [<0000000079300c1d>] usb_hcd_irq
[   35.338126] [<0000000079300c1d>] usb_hcd_irq
[   35.342389] Disabling IRQ #156

ohci_shutdown() disables all interrupts and sets rh_state to
OHCI_RH_HALTED. On the other hand, ohci_irq() can enable
OHCI_INTR_SF and OHCI_INTR_MIE. Note that OHCI_INTR_SF can be
set by start_ed_unlink(), which is reached via:
 ohci_irq()
  -> process_done_list()
   -> takeback_td()
    -> start_ed_unlink()

ohci_irq() contains the following check. The issue occurs when
&ohci->regs->intrenable is OHCI_INTR_MIE | OHCI_INTR_SF while
ohci->rh_state is OHCI_RH_HALTED, because the handler then disowns
an interrupt that the controller keeps raising:

	/* interrupt for some other device? */
	if (ints == 0 || unlikely(ohci->rh_state == OHCI_RH_HALTED))
		return IRQ_NOTMINE;

To fix the issue, ohci_shutdown() holds the spin lock while disabling
the interrupts and changing the rh_state flag, which prevents
OHCI_INTR_MIE from being re-enabled unexpectedly. Note that
io_watchdog_func() also calls ohci_shutdown() while already holding
the spin lock, so the patch adds a new function, _ohci_shutdown().

This patch is inspired by a Renesas R-Car Gen3 BSP patch
from Tho Vu.

Signed-off-by: Yoshihiro Shimoda <[email protected]>
Cc: stable <[email protected]>
Acked-by: Alan Stern <[email protected]>
Link: https://lore.kernel.org/r/1566877910-6020-1-git-send-email-yoshihiro.shimoda.uh@renesas.com
Signed-off-by: Greg Kroah-Hartman <[email protected]>
@torvalds torvalds closed this Aug 2, 2023
gyohng pushed a commit to gyohng/linux-h616 that referenced this pull request May 15, 2024
Checking for PI/PP boosting mutex is not enough when dropping to
in-band context: owning any mutex in this case would be wrong, since
this would create a priority inversion.

Extend the logic of evl_detect_boost_drop() to encompass any owned
mutex, renaming it to evl_check_no_mutex() for consistency. As a
side-effect, the thread which attempts to switch in-band while owning
mutex(es) now receives a single HMDIAG_LKDEPEND notification, instead
of notifying all waiter(s) sleeping on those mutexes.
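
The owner-side check described above can be pictured with a very small model.
This is a sketch under assumptions, not the EVL implementation: struct
toy_thread and toy_check_no_mutex() are hypothetical, and the notification is
reduced to a counter.

```c
#include <stdbool.h>

/* Hypothetical per-thread bookkeeping; EVL tracks this differently. */
struct toy_thread {
	int owned_mutexes; /* number of mutexes currently owned */
	int notifications; /* diagnostics delivered to this thread */
};

/* Returns true when the thread owns no mutex and may switch in-band. */
static bool toy_check_no_mutex(struct toy_thread *t)
{
	if (t->owned_mutexes == 0)
		return true;

	/* One notification to the owner, however many mutexes it holds. */
	t->notifications++;
	return false;
}
```

The point of the model is only that the diagnostic goes once to the offending
owner rather than once per waiter.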

As a consequence, we can drop detect_inband_owner() which becomes
redundant as it detects the same issue from the converse side without
extending the test coverage (i.e. a contender would check whether the
mutex owner is running in-band).

This change does affect the behavior for applications turning on
T_WOLI on waiter threads explicitly. That said, the same issue would
still be detected if CONFIG_EVL_DEBUG_WOLI is set globally, which is
the recommended configuration during the development stage.

This change also solves an ABBA issue which existed in the former
implementation:

[   40.976962] ======================================================
[   40.976964] WARNING: possible circular locking dependency detected
[   40.976965] 5.15.77-00716-g8390add2f766 #156 Not tainted
[   40.976968] ------------------------------------------------------
[   40.976969] monitor-pp-lazy/363 is trying to acquire lock:
[   40.976971] ffff99c5c14e5588 (test363.0){....}-{0:0}, at: evl_detect_boost_drop+0x80/0x200
[   40.976987]
[   40.976987] but task is already holding lock:
[   40.976988] ffff99c5c243d818 (monitor-pp-lazy:363){....}-{0:0}, at: evl_detect_boost_drop+0x0/0x200
[   40.976996]
[   40.976996] which lock already depends on the new lock.
[   40.976996]
[   40.976997]
[   40.976997] the existing dependency chain (in reverse order) is:
[   40.976998]
[   40.976998] -> #1 (monitor-pp-lazy:363){....}-{0:0}:
[   40.977003]        fast_grab_mutex+0xca/0x150
[   40.977006]        evl_lock_mutex_timeout+0x60/0xa90
[   40.977009]        monitor_oob_ioctl+0x226/0xed0
[   40.977014]        EVL_ioctl+0x41/0xa0
[   40.977017]        handle_pipelined_syscall+0x3d8/0x490
[   40.977021]        __pipeline_syscall+0xcc/0x2e0
[   40.977026]        pipeline_syscall+0x47/0x120
[   40.977030]        syscall_enter_from_user_mode+0x40/0xa0
[   40.977036]        do_syscall_64+0x15/0xf0
[   40.977039]        entry_SYSCALL_64_after_hwframe+0x61/0xcb
[   40.977044]
[   40.977044] -> #0 (test363.0){....}-{0:0}:
[   40.977048]        __lock_acquire+0x133a/0x2530
[   40.977053]        lock_acquire+0xce/0x2d0
[   40.977056]        evl_detect_boost_drop+0xb0/0x200
[   40.977059]        evl_switch_inband+0x41e/0x540
[   40.977064]        do_oob_syscall+0x1bc/0x3d0
[   40.977067]        handle_pipelined_syscall+0xbe/0x490
[   40.977071]        __pipeline_syscall+0xcc/0x2e0
[   40.977075]        pipeline_syscall+0x47/0x120
[   40.977079]        syscall_enter_from_user_mode+0x40/0xa0
[   40.977083]        do_syscall_64+0x15/0xf0
[   40.977086]        entry_SYSCALL_64_after_hwframe+0x61/0xcb
[   40.977090]
[   40.977090] other info that might help us debug this:
[   40.977090]
[   40.977091]  Possible unsafe locking scenario:
[   40.977091]
[   40.977092]        CPU0                    CPU1
[   40.977093]        ----                    ----
[   40.977094]   lock(monitor-pp-lazy:363);
[   40.977096]                                lock(test363.0);
[   40.977098]                                lock(monitor-pp-lazy:363);
[   40.977100]   lock(test363.0);
[   40.977102]
[   40.977102]  *** DEADLOCK ***
[   40.977102]
[   40.977103] 1 lock held by monitor-pp-lazy/363:
[   40.977105]  #0: ffff99c5c243d818 (monitor-pp-lazy:363){....}-{0:0}, at: evl_detect_boost_drop+0x0/0x200
[   40.977113]
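
The ABBA pattern in the report above can be reproduced with a toy lock-order
tracker. This is a simplified sketch, not how lockdep is implemented:
toy_note_acquire() and the dep table are hypothetical, locks are small integer
ids below TOY_MAX_LOCKS, and only a direct two-lock inversion is detected.

```c
#include <stdbool.h>

#define TOY_MAX_LOCKS 4

/* dep[a][b] is true once lock b has been taken while lock a was held. */
static bool dep[TOY_MAX_LOCKS][TOY_MAX_LOCKS];

/*
 * Record that 'wanted' is being acquired while 'held' is owned.
 * Returns false if the opposite order was seen before (ABBA inversion).
 */
static bool toy_note_acquire(int held, int wanted)
{
	if (dep[wanted][held])
		return false;

	dep[held][wanted] = true;
	return true;
}
```

With lock 0 standing for monitor-pp-lazy:363 and lock 1 for test363.0, taking
1 under 0 and later 0 under 1 makes the second call fail, which corresponds to
the two-CPU scenario printed in the log.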

Signed-off-by: Philippe Gerum <[email protected]>
gyohng pushed a commit to gyohng/linux-h616 that referenced this pull request Jun 3, 2024
Checking for PI/PP boosting mutex is not enough when dropping to
in-band context: owning any mutex in this case would be wrong, since
this would create a priority inversion.

Extend the logic of evl_detect_boost_drop() to encompass any owned
mutex, renaming it to evl_check_no_mutex() for consistency. As a
side-effect, the thread which attempts to switch in-band while owning
mutex(es) now receives a single HMDIAG_LKDEPEND notification, instead
of notifying all waiter(s) sleeping on those mutexes.

As a consequence, we can drop detect_inband_owner() which becomes
redundant as it detects the same issue from the converse side without
extending the test coverage (i.e. a contender would check whether the
mutex owner is running in-band).

This change does affect the behavior for applications turning on
T_WOLI on waiter threads explicitly. That said, the same issue would
still be detected if CONFIG_EVL_DEBUG_WOLI is set globally, which is
the recommended configuration during the development stage.

This change also solves an ABBA issue which existed in the former
implementation:

[   40.976962] ======================================================
[   40.976964] WARNING: possible circular locking dependency detected
[   40.976965] 5.15.77-00716-g8390add2f766 #156 Not tainted
[   40.976968] ------------------------------------------------------
[   40.976969] monitor-pp-lazy/363 is trying to acquire lock:
[   40.976971] ffff99c5c14e5588 (test363.0){....}-{0:0}, at: evl_detect_boost_drop+0x80/0x200
[   40.976987]
[   40.976987] but task is already holding lock:
[   40.976988] ffff99c5c243d818 (monitor-pp-lazy:363){....}-{0:0}, at: evl_detect_boost_drop+0x0/0x200
[   40.976996]
[   40.976996] which lock already depends on the new lock.
[   40.976996]
[   40.976997]
[   40.976997] the existing dependency chain (in reverse order) is:
[   40.976998]
[   40.976998] -> #1 (monitor-pp-lazy:363){....}-{0:0}:
[   40.977003]        fast_grab_mutex+0xca/0x150
[   40.977006]        evl_lock_mutex_timeout+0x60/0xa90
[   40.977009]        monitor_oob_ioctl+0x226/0xed0
[   40.977014]        EVL_ioctl+0x41/0xa0
[   40.977017]        handle_pipelined_syscall+0x3d8/0x490
[   40.977021]        __pipeline_syscall+0xcc/0x2e0
[   40.977026]        pipeline_syscall+0x47/0x120
[   40.977030]        syscall_enter_from_user_mode+0x40/0xa0
[   40.977036]        do_syscall_64+0x15/0xf0
[   40.977039]        entry_SYSCALL_64_after_hwframe+0x61/0xcb
[   40.977044]
[   40.977044] -> #0 (test363.0){....}-{0:0}:
[   40.977048]        __lock_acquire+0x133a/0x2530
[   40.977053]        lock_acquire+0xce/0x2d0
[   40.977056]        evl_detect_boost_drop+0xb0/0x200
[   40.977059]        evl_switch_inband+0x41e/0x540
[   40.977064]        do_oob_syscall+0x1bc/0x3d0
[   40.977067]        handle_pipelined_syscall+0xbe/0x490
[   40.977071]        __pipeline_syscall+0xcc/0x2e0
[   40.977075]        pipeline_syscall+0x47/0x120
[   40.977079]        syscall_enter_from_user_mode+0x40/0xa0
[   40.977083]        do_syscall_64+0x15/0xf0
[   40.977086]        entry_SYSCALL_64_after_hwframe+0x61/0xcb
[   40.977090]
[   40.977090] other info that might help us debug this:
[   40.977090]
[   40.977091]  Possible unsafe locking scenario:
[   40.977091]
[   40.977092]        CPU0                    CPU1
[   40.977093]        ----                    ----
[   40.977094]   lock(monitor-pp-lazy:363);
[   40.977096]                                lock(test363.0);
[   40.977098]                                lock(monitor-pp-lazy:363);
[   40.977100]   lock(test363.0);
[   40.977102]
[   40.977102]  *** DEADLOCK ***
[   40.977102]
[   40.977103] 1 lock held by monitor-pp-lazy/363:
[   40.977105]  #0: ffff99c5c243d818 (monitor-pp-lazy:363){....}-{0:0}, at: evl_detect_boost_drop+0x0/0x200
[   40.977113]
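
The report above boils down to two code paths taking the same pair of locks in opposite orders. As a rough illustration of how such an ABBA pattern can be caught, here is a minimal Python sketch of a lock-order graph with a cycle check. It is not how lockdep is implemented; the `LockOrderGraph` class and its methods are hypothetical, and only the two lock names used in the usage note are taken from the log.

```python
from collections import defaultdict


class LockOrderGraph:
    """Records 'held -> acquired' edges and rejects an edge that closes a cycle."""

    def __init__(self):
        self.after = defaultdict(set)  # lock -> locks taken while holding it

    def _reaches(self, src, dst):
        # Iterative depth-first search over the recorded orderings.
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(self.after[node])
        return False

    def acquire(self, held, new):
        """Return True if taking `new` while holding `held` keeps the order consistent."""
        if self._reaches(new, held):  # `new` already precedes `held`: ABBA
            return False
        self.after[held].add(new)
        return True
```

Recording `acquire("test363.0", "monitor-pp-lazy:363")` first (dependency #1 in the log) and then attempting `acquire("monitor-pp-lazy:363", "test363.0")` (dependency #0) makes the second call return False, which corresponds to the circular dependency reported above.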

Signed-off-by: Philippe Gerum <[email protected]>
gyohng pushed a commit to gyohng/linux-h616 that referenced this pull request Oct 1, 2024
gyohng pushed a commit to gyohng/linux-h616 that referenced this pull request Oct 3, 2024